1 Introduction

STEPparse,  the  NIST  STEP  physical  file  parser,  and  the  associated  STEP  Working 
Form,  are  Public  Domain  tools  for  manipulating  product  models  stored  in  the  STEP 
physical  file  format  [Altemueller88].  These  tools  are  a part  of  the  NIST  PDES  Toolkit 
[Clark90a],  and  are  geared  particularly  toward  building  STEP  translators.  The  STEP 
Working  Form  is  an  in-memory  representation  for  STEP  product  models.  It  relies  on 
the  NIST  Express  Working  Form  [Clark90b]  as  an  in-core  data  dictionary,  which  pro- 
vides a context  in  which  STEP  models  can  be  interpreted.  The  Working  Form  code  and 
the  STEPparse  parser  itself  are  both  written  to  be  independent  of  any  particular  schema: 
simply  plug  in  some  Express  language  information  model  [Schenck90],  and  the  code  is 
ready  to  run. 

A primary  goal  in  the  development  of  STEPparse  was  to  provide  a clean  back-end  in- 
terface which  would  allow  various  output  modules  to  be  easily  plugged  into  the  basic 
front-end  parser.  To  accomplish  this,  the  parser  builds  up  a set  of  data  structures  (the 
STEP  Working  Form)  containing  all  of  the  information  in  a STEP  source  file.  It  can 
then  dynamically  load  one  or  more  output  modules.  Each  module  walks  through  the 
Working Form, extracting relevant subsets of the available data and producing an
appropriately formatted output file. Three STEPparse output modules are provided with
the NIST PDES Toolkit: one which produces Smalltalk-80™ object instantiations, one
which produces a STEP physical file (so that the Working Form can be used to translate
to as well as from STEP), and one which loads an SQL database from the STEP Working
Form [Nickerson90]. The first is used by QDES [Clark90d], a prototype STEP
model editor written in Smalltalk-80.

1.1  Context 

The  PDES  (Product  Data  Exchange  using  STEP)  activity  is  the  United  States’  effort  in 
support  of  the  Standard  for  the  Exchange  of  Product  Model  Data  (STEP),  an  emerging 
international  standard  for  the  interchange  of  product  data  between  various  vendors’ 
CAD/CAM  systems  and  other  manufacturing-related  software  [Smith88].  A National 
PDES  Testbed  has  been  established  at  the  National  Institute  of  Standards  and  Technol- 
ogy to  provide  testing  and  validation  facilities  for  the  emerging  standard.  The  Testbed 
is  funded  by  the  CALS  (Computer-aided  Acquisition  and  Logistic  Support)  program  of 
the  Office  of  the  Secretary  of  Defense.  As  part  of  the  testing  effort,  NIST  is  charged 
with  providing  a software  toolkit  for  manipulating  PDES  data.  This  NIST  PDES  Tool- 
kit is  an  evolving,  research-oriented  set  of  software  tools.  This  document  is  one  of  a set 
of  reports  which  describe  various  aspects  of  the  Toolkit.  An  overview  of  the  Toolkit  is 
provided  in  [Clark90a],  along  with  references  to  the  other  documents  in  the  set. 

2 Implementation Environment

STEPparse  and  the  STEP  Working  Form  were  developed  on  Sun  Microsystems  Sun- 
3™  and  Sun-4™  workstations  running  the  Unix™  operating  system.  The  parser  is  im- 
plemented in  Yacc  and  Lex,  the  Unix  tools  for  generating  parsers  and  lexical  analyzers. 
The  Working  Form  data  structures  are  implemented  in  ANSI  Standard  C [ANSI89]. 

The grammar for the language is processed by Bison, the Free Software Foundation’s¹
implementation of the Yacc parser generator. The lexical analyzer is produced by
Flex², a fast, public domain implementation of Lex. The C compiler used is GCC, also
a product of the Free Software Foundation, although the Working Form code does not
specifically depend on any particular compiler.

3 Running  STEPparse 

STEPparse  takes  several  optional  command-line  arguments: 

STEPparse [-d <number>]
          [-e <express>]
          [-s <step>]

The  -d  option  controls  the  debugging  level;  the  argument  can  range  from  0 (the  de- 
fault) to  10.  The  Express  schema  file  is  specified  with  -e;  if  no  -e  option  is  given,  the 
schema  is  read  from  standard  input.  The  STEP  input  file  is  specified  with  -s;  again, 
standard  input  is  read  if  there  is  no  -s  option.  At  least  one  of  -e  or  -s  must  be  spec- 
ified; STEPparse  cannot  read  both  from  standard  input. 

STEPparse  can  be  built  in  two  different  ways,  resulting  in  different  interaction  patterns. 
For  many  applications,  a single  output  module  is  bound  into  the  translator  at  build  time. 
In  this  statically  linked  case,  after  the  STEP  source  file  has  been  parsed  the  user  is  nor- 
mally prompted  for  a single  file  name.  This  is  the  name  of  the  file  to  which 
STEPparse’s  output  will  be  written.  In  the  other  (dynamically  linked)  version,  no  spe- 
cific output  module  is  loaded  at  build  time.  In  this  case,  after  parsing  its  input,  the  pro- 
gram asks  for  an  output  module.  If  the  file  named  is  an  appropriate  object  file,  it  is 
loaded  and  an  output  file  name  requested,  which  is  where  the  output  will  be  written. 
Another  output  module  is  then  requested,  and  this  sequence  continues  until  the  user  en- 
ters an  empty  line  as  the  name  of  the  output  module,  which  signals  STEPparse  to  exit. 
This dynamic loading facility is only available under BSD4.2 Unix and its derivatives.
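
The interaction pattern of the dynamically linked version can be sketched as a simple prompt loop. Everything named here is an assumption made for illustration: the report does not describe how object files are loaded, so loading is hidden behind a hypothetical LoaderFn callback, and stub_loader and stub_output are stand-ins that merely accept names ending in ".o". A real loader might wrap a facility such as dlopen; the report does not say.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical types: a loaded module's entry point, and a loader that
   returns that entry point or NULL if the file is not a usable module. */
typedef int (*OutputFn)(const char *out_name);
typedef OutputFn (*LoaderFn)(const char *module_name);

/* Read one line and strip its newline. Returns 0 at end of input. */
static int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Ask for modules until an empty line is entered.
   Returns the number of output files produced successfully. */
int output_loop(FILE *in, LoaderFn load)
{
    char module[256], output[256];
    int reports = 0;

    for (;;) {
        OutputFn emit;

        fputs("Output module: ", stderr);
        if (!read_line(in, module, sizeof module) || module[0] == '\0')
            break;                       /* empty line: exit */
        emit = load(module);
        if (emit == NULL) {
            fprintf(stderr, "%s: not a loadable output module\n", module);
            continue;                    /* ask for another module */
        }
        fputs("Output file: ", stderr);
        if (!read_line(in, output, sizeof output))
            break;
        if (emit(output) == 0)
            reports++;
    }
    return reports;
}

/* Stand-in module: reports what it would do instead of writing a file. */
static int stub_output(const char *out_name)
{
    printf("would write output to %s\n", out_name);
    return 0;
}

/* Stand-in loader: accepts any name ending in ".o". */
static OutputFn stub_loader(const char *module_name)
{
    size_t n = strlen(module_name);
    return (n > 2 && strcmp(module_name + n - 2, ".o") == 0) ? stub_output : NULL;
}
```

Note that the output file name is requested only after a module has loaded successfully, matching the sequence described above.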


1. The Free Software Foundation (FSF) of Cambridge, Massachusetts is responsible for the GNU Project,
whose ultimate goal is to provide a free implementation of the Unix operating system and environment.
These tools are not in the Public Domain: FSF retains ownership and copyright privileges, but grants free
distribution rights under certain terms. At this writing, further information is available by electronic mail on the
Internet from gnu@prep.ai.mit.edu.

2. Vern Paxson’s Fast Lex is usually distributed with GNU software. It is, however, in the Public Domain,
and  is  not  an  FSF  product.  Thus,  it  does  not  come  under  the  FSF  licensing  restrictions. 

4 Design Overview

The  STEP  Working  Form  (WF)  is  designed  in  an  object-oriented  fashion,  and  is  intend- 
ed to  mesh  cleanly  with  the  NIST  Express  Working  Form.  Indeed,  the  WF  currently 
relies  on  the  structures  of  the  Express  Working  Form  as  an  in-memory  data  dictionary. 
This  section  discusses  the  design  of  the  Working  Form,  describing  STEPparse  control 
flow  as  well  as  the  data  abstractions  of  the  WF.  More  technical  detail  can  be  found  in 
[Clark90c].

4.1  STEPparse  Control  Flow 

A STEPparse  translator  consists  of  two  separate  passes:  parsing  and  output  generation. 
The  first  pass  builds  an  instantiated  product  model  from  a STEP  source  file.  This  model 
can  then  be  traversed  by  an  output  module  in  the  second  pass,  producing  whatever  re- 
port is  desired. 

As  currently  implemented,  STEPparse  must,  in  fact,  parse  an  Express  schema  (with 
Fed-X)  before  it  can  interpret  the  constructs  in  a STEP  physical  file.  To  do  this, 
STEPparse first invokes Fed-X’s first two passes to build a data dictionary, and then
proceeds  to  parse  its  STEP  source  file. 

4.2  Working  Form  Data  Structures 

The  STEP  Working  Form  consists  of  two  data  abstractions.  The  Instance  abstraction 
represents  individual  entity  instances  in  a product  model,  as  well  as  aggregates  and  un- 
structured values  (integers,  booleans,  etc.).  A more  object-oriented  design  would  clear- 
ly break  these  down  into  several  separate  subclasses  of  Instance;  implementation 
considerations  have  resulted  in  a single  module.  The  second  abstraction  represents  a 
complete  product  model.  This  basically  consists  of  an  ordered  collection  of  Instances 
and  an  Express  model  to  give  it  a context.  The  Working  Form  currently  does  not  record 
header  information  (as  found  in  STEP  physical  files),  although  this  would  certainly  be 
useful. 

4.2.1  Instance 

As  mentioned  above,  the  Instance  abstraction  is  really  the  union  of  several  other  con- 
ceptual classes,  representing  entity  instances,  simple-typed  data  (integer  values,  bool- 
eans, etc.),  and  various  kinds  of  aggregates.  Most  of  the  access  functions  for  this 
abstraction  are  restricted  to  act  on  Instances  of  certain  classes,  which  indicates  very 
clearly  the  need  for  this  module  to  be  broken  down  into  its  component  classes;  this  has 
not  been  done,  primarily  because  of  limitations  of  the  implementation  language,  C. 

Certain  attributes  are  common  to  all  Instances.  For  example,  each  instance  is  marked 
with  a Type,  which  determines  the  context(s)  in  which  it  can  be  used.  The  Type  also 
provides  an  interpretation  for  the  Instance’s  value.  A user  data  field  is  provided  so  that 
an arbitrary C pointer can be associated with each Instance in an instantiated model.
This allows a Working Form model to be linked to the internal data structures of a solid
modeler, for example.

Additionally, every Instance has a value field. The type of this field varies widely with
the  type  of  the  Instance,  but  there  are  three  primary  classes:  simple  (unstructured)  val- 
ues, aggregates,  and  entity  instances.  Examples  of  Instances  with  simple  values  are 
numbers,  strings,  and  booleans.  These  instances  each  have  a single,  atomic  value  of  the 
corresponding  C type  (int,  char*,  etc.).  An  aggregate  (which  may  be  an  array,  bag, 
list,  or  set)  consists  of  a collection  of  values,  each  of  which  is  itself  an  Instance.  These 
elements  can  be  accessed  via  indexing,  with  valid  indices  ranging  between  lower  and 
upper  bounds  specified  by  the  Express  model.  Note  that  these  bounds  are  interpreted 
differently  for  different  classes  of  aggregates  in  Express.  The  bounds  directly  specify 
the  range  of  allowable  indices  for  an  array,  while  they  limit  the  size  of  other  aggregate 
types. Thus, indices for lists, bags, and sets range from 0 to the current size of the
aggregate, which in turn must fall between the upper and lower bounds. The Express
language also specifies type-specific operations for each class of aggregates, such as
intersection  and  union  of  sets  and  bags,  and  list  concatenation.  These  operations  are 
provided  by  the  STEP  WF  as  the  preferred  mode  of  interaction  with  aggregate  Instanc- 
es. An  entity  instance’s  value  again  consists  of  a collection  of  Instances.  These  are  ac- 
cessed by  name,  using  the  attribute  names  from  the  entity’s  class. 
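
The two interpretations of aggregate bounds can be made concrete with a small sketch. The type and function names below are hypothetical and are not taken from the Working Form source. The sketch also assumes zero-based indexing in which the last valid index of a bag, list, or set is one less than its current size; the text above does not state whether the upper end is inclusive.

```c
/* Hypothetical description of an aggregate's kind, bounds, and size. */
typedef enum { AGG_ARRAY, AGG_BAG, AGG_LIST, AGG_SET } AggregateKind;

typedef struct {
    AggregateKind kind;
    int lower;   /* lower bound from the Express model */
    int upper;   /* upper bound from the Express model */
    int size;    /* current element count; meaningful for bag, list, set */
} AggregateInfo;

/* Returns 1 if index may be used to access an element, 0 otherwise. */
int aggregate_index_ok(const AggregateInfo *a, int index)
{
    if (a->kind == AGG_ARRAY)
        return index >= a->lower && index <= a->upper;  /* bounds are indices */
    /* Assumption: zero-based, so the last valid index is size - 1. */
    return index >= 0 && index < a->size;
}

/* Returns 1 if the current size satisfies the Express bounds, 0 otherwise. */
int aggregate_size_ok(const AggregateInfo *a)
{
    if (a->kind == AGG_ARRAY)
        return 1;   /* array bounds constrain indices, not size */
    return a->size >= a->lower && a->size <= a->upper;
}
```

For an array declared with bounds 1 and 5, index 0 is rejected; for a list with bounds 2 and 4 that currently holds three elements, indices 0 through 2 are accepted and the size check passes.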

Finally,  an  Instance  may  have  a name.  Normally,  only  external  (non-embedded)  enti- 
ties will  be  named;  all  other  Instances  will  have  NULL  names.  This  is  due  to  the  usage 
prescribed by the STEP Physical File format: an embedded entity cannot be referenced
outside  of  the  immediate  context  in  which  it  is  defined,  and  so  has  no  need  for  a name. 
An  external  entity,  on  the  other  hand,  can  be  referenced  by  any  other  entity  in  a product 
model.  This  reference  requires  the  entity’s  name  as  a handle. 
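
One way to picture the Instance abstraction as a single C module is a tagged union. The layout below is a sketch under stated assumptions, not the actual Working Form declaration: the field names, the InstanceClass tags, and both functions are hypothetical, and the Express Working Form Type is represented by an opaque pointer. The sketch is meant to show why most access functions must check the class of the Instance they are given.

```c
#include <stddef.h>

/* Hypothetical class tags for the conceptual subclasses of Instance. */
typedef enum {
    INST_INTEGER, INST_STRING, INST_BOOLEAN, INST_AGGREGATE, INST_ENTITY
} InstanceClass;

typedef struct Instance Instance;
struct Instance {
    InstanceClass cls;
    const void *type;      /* stand-in for the Express Working Form Type */
    const char *name;      /* NULL except for external entities */
    void *user_data;       /* arbitrary pointer owned by the application */
    union {
        int integer;
        const char *string;
        int boolean;
        struct {           /* aggregate elements or entity attribute values */
            Instance **items;
            int count;
        } members;
    } value;
};

/* Accessor restricted to integer Instances.
   Returns 0 and stores the value on success, -1 for any other class. */
int instance_get_integer(const Instance *inst, int *out)
{
    if (inst == NULL || inst->cls != INST_INTEGER)
        return -1;
    *out = inst->value.integer;
    return 0;
}

/* An entity is external exactly when it carries a name. */
int instance_is_external(const Instance *inst)
{
    return inst->cls == INST_ENTITY && inst->name != NULL;
}
```

The run-time class check in instance_get_integer is the cost of folding several conceptual classes into one structure; separate subclasses would let a compiler enforce the same restriction.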

4.2.2  Product 

The  Product  abstraction  ties  things  together  in  the  STEP  Working  Form.  This  module 
is  used  to  represent  a STEP  product  model  as  a whole.  A Product  consists  of  a collec- 
tion of  (presumably  interconnected)  Instances  and  an  Express  conceptual  schema  to 
give  these  Instances  context.  This  schema  serves  as  a data  dictionary  for  the  Product. 
External  entities  in  a Product  can  be  looked  up  by  name;  other  Instances  can  only  be 
retrieved  by  coming  upon  them  as  components  of  known  Instances. 

Externally,  a Product  looks  like  a somewhat  intelligent  container  object.  New  Instances 
can  be  added  to  this  container,  and  existing  Instances  can  be  retrieved  from  it  by  name. 
Additional  functionality  can  be  gained  from  the  attached  Express  information  model. 
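
The container behavior described above might look like the following sketch. It is not the Working Form interface: the Product layout and the product_* function names are hypothetical, the Express schema is an opaque pointer, and Instance is reduced to the two fields needed here. A linear search is used for lookup purely for brevity.

```c
#include <stdlib.h>
#include <string.h>

/* Reduced, hypothetical Instance: only what this sketch needs. */
typedef struct {
    const char *name;   /* NULL for embedded entities and non-entities */
    void *user_data;
} Instance;

/* Hypothetical Product: an ordered collection plus its schema. */
typedef struct {
    const void *schema;   /* stand-in for the Express model */
    Instance **items;
    size_t count;
    size_t capacity;
} Product;

void product_init(Product *p, const void *schema)
{
    p->schema = schema;
    p->items = NULL;
    p->count = 0;
    p->capacity = 0;
}

/* Append an Instance, growing the array as needed.
   Returns 0 on success, -1 if memory cannot be obtained. */
int product_add(Product *p, Instance *inst)
{
    if (p->count == p->capacity) {
        size_t new_cap = p->capacity ? p->capacity * 2 : 8;
        Instance **grown = realloc(p->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        p->items = grown;
        p->capacity = new_cap;
    }
    p->items[p->count++] = inst;
    return 0;
}

/* Find an external entity by name; unnamed Instances are never matched. */
Instance *product_lookup(const Product *p, const char *name)
{
    size_t i;
    for (i = 0; i < p->count; i++) {
        const char *candidate = p->items[i]->name;
        if (candidate != NULL && strcmp(candidate, name) == 0)
            return p->items[i];
    }
    return NULL;
}

/* Release the array; the Instances themselves belong to the caller. */
void product_free(Product *p)
{
    free(p->items);
    product_init(p, p->schema);
}
```

Because unnamed Instances are skipped during lookup, they remain reachable only through the Instances that contain them, as the text describes.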

5 Missing  Features 

Currently,  the  Working  Form  does  not  handle  user-defined  entities.  STEPparse  accepts 
user-defined  entities  in  a source  file,  and  prints  a warning  message  indicating  that  they 
cannot  be  represented  in  the  Working  Form. 

As  mentioned  above,  file  header  information  from  PDES/STEP  physical  files  is  not  re- 
tained in  the  Working  Form,  although  STEPparse  silently  accepts  file  headers. 

Aggregates with non-constant expressions as bounds are not handled properly. Such
an  aggregate’s  type  information  accurately  reflects  the  true  upper  bound,  but  the  STEP 
WF  routines  treat  the  bound  as  if  it  were  unspecified.  Since  unbounded  aggregates  are 
dynamically  sized,  this  does  not  cause  memory  management  problems;  the  only  draw- 
back is  that  the  Working  Form  does  not  enforce  size  constraints  on  such  aggregates. 
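
The practical effect of this limitation can be shown in a few lines. The UpperBound type and size_accepted function are hypothetical and exist only to illustrate the behavior described above: a bound that is not a constant is treated as if no bound had been given, so any size is accepted.

```c
/* Hypothetical representation of an aggregate's upper bound. */
typedef struct {
    int is_constant;  /* 0 when the bound is a non-constant expression */
    int value;        /* meaningful only when is_constant is nonzero */
} UpperBound;

/* Returns 1 if an aggregate of the given size is accepted, 0 otherwise. */
int size_accepted(const UpperBound *upper, int size)
{
    if (!upper->is_constant)
        return 1;   /* treated as unspecified: no limit is enforced */
    return size <= upper->value;
}
```

Enforcing such bounds would require evaluating the bound expression in the context of the instance being checked, which this sketch does not attempt.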

Comments are currently discarded during lexical analysis, and so have no
chance to be recorded by the parser. There has been some interest in developing a
mechanism through which applications that modify STEP physical files can preserve
comments found in the input file.

6 Conclusion 

The  combination  of  the  STEP  Working  Form  with  an  Express  Working  Form  data  dic- 
tionary provides  a flexible  mechanism  for  performing  various  manipulations  of  STEP 
data in a schema-independent manner. Although it remains to be seen how useful this
schema-independence will be in higher-level end-user applications (e.g., design editors,
configuration  management  systems,  and  process  planning  systems),  the  present  archi- 
tecture is  quite  useful  for  such  generic  tasks  as  translation  and  database  loading. 

For  further  information  on  STEPparse,  the  STEP  Working  Form,  or  other  components 
of  the  Toolkit,  or  to  obtain  a copy  of  the  software,  use  the  attached  order  form. 